COVID-19 has a detrimental impact on the respiratory system, and
the severity of the infection may be determined using a suitable imaging
technique. Chest computed tomography (CT) imaging is a reliable diagnostic
technique for detecting COVID-19 early and slowing its progression. Recent
research shows that deep learning algorithms, particularly convolutional
neural networks (CNNs), can accurately diagnose COVID-19 from lung CT
scan images. In an emergency, however, detection accuracy alone is not enough:
data loss and classification completion time also play a critical
role. This study addresses the issue by finding the most efficient CNN
model, that is, the one that combines high accuracy with low data loss and
short classification time. Eight deep learning models (Max Pooling 2D,
Average Pooling 2D, VGG19, VGG16, MobileNetV2, InceptionV3, AlexNet, and
NFNet) are compared on a dataset of 16000 CT scan images of COVID-19 and
non-COVID-19 cases. The performance of the models is compared using the
confusion matrix together with the data loss and completion time. The
results show that MobileNetV2 provides the highest accuracy of
99.12%, with a data loss of 0.0504% and a classification
completion time of 16.5 s per epoch. Thus, employing MobileNetV2 gives
the best overall result in an emergency.
1. INTRODUCTION 


COVID-19 is caused by severe acute respiratory syndrome coronavirus 2, commonly known as
SARS-CoV-2 [1]. SARS-CoV-2 is a socially transmitted virus. While the
majority of COVID-19 patients present with minor symptoms, a small proportion develop serious or
life-threatening complications. Infection may result in pneumonia, severe respiratory distress, multiorgan
failure, and death in a growing number of cases [2]. A crucial step in combating
COVID-19 is efficient screening of infected individuals, allowing for the isolation and treatment of
positive patients. Currently, a major screening technique for COVID-19 detection is CT scan imaging of
the lungs. The test is performed on the patient's chest and the result is ready within minutes. The lungs of
patients with COVID-19 symptoms exhibit certain visual characteristics, such as ground-glass opacities
(hazy patches), that may distinguish COVID-19-infected individuals from non-infected patients [3], [4].
A detection technique based on chest radiography images has a number of benefits over the traditional
approach. It can be quick, evaluate many cases concurrently, and increase availability; most significantly,
such a system can be very helpful in hospitals that lack or have a limited number of testing kits and
resources. Additionally, owing to radiography's importance in today's health care system, radiology imaging
equipment is widely available in hospitals, making radiography-based approaches easier to adopt and more
accessible. Since 2020, there has been a rise in the number of publicly accessible CT scan images from healthy
individuals as well as from those with COVID-19. This makes it possible to examine medical images and discover
patterns that may allow the illness to be diagnosed automatically. Machine learning techniques for
automated diagnosis have recently gained favor in the medical sector as an auxiliary tool for professionals
[5]-[9]. Deep learning, a prominent field of study in artificial intelligence (AI), allows the development of
end-to-end models capable of achieving promising outcomes from input data without requiring manual
feature extraction [10], [11]. Numerous studies address the diagnosis of lung illness using AI
to analyze medical images. AI is a rapidly growing area dedicated to the
creation of models from data, and its use in developing techniques that help professionals
interpret medical images has accelerated in recent years. Transfer learning, in particular, is
emerging as a deep learning technique in which a model created for one task is used as the starting point
for a model on a second task. Recent efforts have shown potential in enhancing detection in a variety of
medical fields, including kidney cancer, lung cancer, and breast cancer detection. Nowadays,
multiple pre-trained deep learning models are used to detect and predict COVID-19 from CT scans or
X-ray images. In this study, eight deep learning models are used to detect COVID-19 in chest CT scan
images, and their accuracy, data loss, and classification completion time are compared.


2. LITERATURE REVIEW 

Researchers have been drawn to COVID-19 classification and have built algorithms to deal with this
new problem. Digital image processing algorithms have been used extensively in
medicine and have demonstrated their efficacy with acceptable outcomes. For this reason, and because a
trustworthy method for diagnosing this viral illness is urgently required, these algorithms have been
among the most popular approaches to finding a solution. Many new methods have recently been developed to
identify and diagnose the illness in its early stages and so save the lives of the people suffering from it.
In the healthcare sector, image processing algorithms can assist with organ segmentation,
disease identification and categorization, prediction, and more.

Seum et al. [12] performed a comparative study to examine the performance of the CNN architectures
DenseNet169 and DenseNet201 in identifying COVID-19 from CT scan images. The U-Net segmentation
technique is also examined to determine its effect on the performance of the CNN models. The dataset,
SARS-COV-2 CT-Scan, contains 2481 CT scan images. The DenseNet169 architecture obtained an accuracy of
89.31% without the segmentation method, whereas the DenseNet201 model achieved an accuracy of 89.67%
using U-Net.

In Polsinelli et al. [13], a light CNN architecture based on the SqueezeNet model is proposed to
efficiently distinguish COVID-19 CT images from those of other patients suspected of having pneumonia and
of healthy individuals. The approach provides an accuracy of 85.03% on the first dataset arrangement and around
3.2% on the second dataset arrangement.

Mishra et al. [14] used transfer learning to build an algorithm for detecting COVID-19 from CT
scan images classified as Healthy (Normal), COVID-19, and Pneumonia. The work employs data
augmentation and fine-tuning methods to enhance and optimize the VGG16 and ResNet50 models, resulting
in an average classification accuracy of 86.74% and 88.52%, respectively.

Naeem and Bin Salem [15] describe how a combination of deep learning and a multi-level feature
extraction methodology is used to classify COVID-19 from CT scans and chest X-rays. GIST,
SIFT, and CNN are used in this method to extract features from image data. The experimental findings show
that the proposed method obtained an accuracy of 98.94%.

The technique proposed by Kundu et al. [16] is an ensemble method that uses the
Gompertz function to generate fuzzy rankings for the base classification models and adaptively fuses the
base models' decision scores to construct predictions. Three transfer learning-based CNN models are
used to generate the decision scores for the proposed ensemble model: VGG-11, WideResNet-50-2, and
InceptionV3. The ensemble method achieves 98.93% and 98.79% accuracy on the SARS-COV-2 and
Harvard Dataverse chest CT datasets, respectively.

Deep learning techniques based on CNN were used in [17] to classify COVID-19 and non-COVID-19
CT scan images. CTnet-10 was developed to detect COVID-19 with an accuracy of 82.1%. Additionally,
other models, including DenseNet-169, VGG-16, ResNet-50, InceptionV3, and VGG-19, were
assessed, with the latter proving to be better with an accuracy of 94.52%.
Most of the studies proposed or used deep learning techniques to identify, predict, and classify
COVID-19 from CT scans and X-ray images of the chest. Table 1 summarizes the relevant
research. This study uses deep learning techniques to classify COVID-19 and identifies the models with the
highest accuracy, lowest data loss, and shortest classification completion time.


Table 1. Overview of the related studies

Paper  Dataset type             Source                 Classes  Model                                  Accuracy
[12]   CT scan dataset          Kaggle.com             2        DenseNet169                            89.31%
                                                                DenseNet201 + U-Net                    89.67%
[13]   CT scan dataset          Github.com             2        Custom SqueezeNet                      85.03%
[14]   CT scan dataset          Kaggle.com + sirm.org  3        VGG16                                  86.74%
                                                                ResNet50                               88.52%
[15]   CT scan + X-ray dataset  Kaggle.com + sirm.org  2        GIST + SIFT + CNN                      98.94%
[16]   CT scan dataset          Github.com             2        VGG11 + WideResNet50-2 + InceptionV3   98.93%
       CT scan dataset          Harvard Dataverse                                                      98.79%
[17]   CT scan dataset          Github.com             2        CTnet-10                               82.1%
                                                                VGG-16                                 94.52%


3. METHODOLOGY 

COVID-19 detection is performed in this study by classifying chest CT scan images of
the lungs. CNN architectures are used to classify the images. To determine which architecture performs best at
identifying COVID-19-infected CT scan images, a confusion matrix is constructed for each implemented model
and the matrices are compared. Additionally, a variety of performance parameters are derived from the confusion matrix.
Finally, to determine which model is the most efficient, three measures are compared: accuracy, rate
of data loss, and classification completion time per epoch. The study's underlying idea is to
find the most efficient CNN model by identifying the model with the greatest accuracy and the lowest data
loss rate in the shortest classification completion time. Figure 1 depicts the study's flow diagram.
Figure 1. System architecture of proposed study 


3.1. Classification models 
3.1.1. Convolutional neural network (CNN) architecture 

A CNN is a deep learning model for image data. It takes an input image and assigns learnable weights
to different elements of it, allowing it to distinguish one image from another [18]-[21]. The model used
here has two convolutional 2D layers, both with ReLU activation.
It implements two dense layers for full connectivity, with ReLU activation for the
first dense layer and sigmoid activation for the second dense layer. Apart from these layers, there are several
hidden layers and an input layer. Two variants of the model are implemented, one with Max Pooling 2D
layers and one with Average Pooling 2D layers.
3.1.2. Max pooling 

Max pooling helps reduce overfitting by providing an abstracted form of the representation. It
also reduces computation time by lowering the number of parameters to learn and provides basic translation
invariance to the internal representation. It performs a pooling operation that selects the largest element
of the feature map region covered by the filter. As the output pixel
count decreases, the dimensions of the feature maps decrease as well.


3.1.3. Average pooling 

Average pooling is a pooling operation that takes the average of the elements in the region of the
feature map covered by the filter. The values in each region are averaged and then fed to the next layer.
As a result, all values in the region contribute to the feature mapping and the output, which makes it a
highly general computation.
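
The two pooling operations described in Sections 3.1.2 and 3.1.3 can be illustrated with a minimal, dependency-free sketch. It assumes a 2x2 window with stride 2 and a small hypothetical feature map; the helper names `pool2d`, `max_pool`, and `average_pool` are illustrative and are not taken from the study's implementation.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(feature_map: Matrix, reduce: Callable[[List[float]], float],
           size: int = 2, stride: int = 2) -> Matrix:
    """Slide a size x size window over the map and reduce each window to one value."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values covered by the window whose top-left corner is (r, c).
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            out_row.append(reduce(window))
        pooled.append(out_row)
    return pooled


def max_pool(feature_map: Matrix, size: int = 2, stride: int = 2) -> Matrix:
    """Keep the largest value of each window."""
    return pool2d(feature_map, max, size, stride)


def average_pool(feature_map: Matrix, size: int = 2, stride: int = 2) -> Matrix:
    """Keep the mean value of each window."""
    return pool2d(feature_map, lambda w: sum(w) / len(w), size, stride)


# Hypothetical 4x4 feature map, used only for illustration.
feature_map = [
    [1.0, 3.0, 2.0, 0.0],
    [5.0, 7.0, 4.0, 6.0],
    [8.0, 2.0, 1.0, 1.0],
    [0.0, 6.0, 3.0, 3.0],
]
print(max_pool(feature_map))      # [[7.0, 6.0], [8.0, 3.0]]
print(average_pool(feature_map))  # [[4.0, 3.0], [4.0, 2.0]]
```

Both operations halve each spatial dimension here; they differ only in whether a window is summarized by its strongest response or by the mean of all its values.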


3.1.4. VGG19 

VGG19 is a variant of the VGG model comprising 16 convolutional layers, three fully connected layers,
five max pooling layers, and one softmax layer. The input to this network is a fixed-size RGB image,
represented as a matrix of the same size. Max pooling is performed over a 2x2 pixel window with stride 2.
Rectified linear units (ReLU) are used to introduce non-linearity into the model, which improves classification
and computing performance. The convolutional layers are followed by three fully connected layers, and the
final layer of the model is a softmax function.


3.1.5. VGG16 

VGG16 uses 1x1 convolution filters, which may be thought of as a linear transformation of the input
channels. The padding of each convolutional layer's input is chosen so that the spatial resolution is
maintained after convolution. In this model, spatial pooling is performed by five max pooling layers that follow
some of the convolutional layers. Max pooling is applied over a 2x2 pixel window with stride 2.
Three fully connected (FC) layers are inserted after the stack of convolutional layers. The softmax layer is the
last one. Layers 1 and 2 are always configured the same way in all networks.


3.1.6. MobileNetV2 

MobileNetV2 is a CNN architecture optimized for mobile devices. MobileNetV2's first fully
convolutional layer has 32 filters, and it is followed by 19 residual bottleneck layers. It is used to classify images,
identify objects, and perform quantization. Two types of blocks are introduced in MobileNetV2.
- Residual block of stride 1.
- Block for downsizing with stride 2.

Both blocks are made up of three layers. The first layer is a 1x1 convolution with the ReLU6 activation
function. On the second layer, a depthwise convolution is performed, and the third layer is another
1x1 convolution, but without any non-linearity.
MobileNetV2 performs well with fewer mathematical operations and a small number of parameters. It is
about 35% quicker than its predecessor, MobileNetV1.
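
To make the "small number of parameters" claim concrete, the following sketch counts the weights of the three-layer block described above and compares its depthwise stage with a standard convolution of the same width. It is a back-of-the-envelope illustration only: biases and batch-normalization parameters are ignored, and the channel counts (32 in, 64 out) and the expansion factor of 6 are assumptions chosen for the example, not values reported in this study.

```python
def relu6(x: float) -> float:
    """ReLU6 activation: clamp the input to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def bottleneck_block_params(c_in: int, c_out: int, expansion: int = 6,
                            kernel: int = 3) -> dict:
    """Weight counts for a 1x1 expand -> depthwise -> 1x1 project block (no biases)."""
    hidden = c_in * expansion                 # channels after the first 1x1 convolution
    expand = c_in * hidden                    # layer 1: 1x1 convolution (followed by ReLU6)
    depthwise = kernel * kernel * hidden      # layer 2: one kernel per channel
    project = hidden * c_out                  # layer 3: linear 1x1 convolution
    return {"expand": expand, "depthwise": depthwise, "project": project,
            "total": expand + depthwise + project}


def standard_conv_params(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Weight count for a standard convolution that mixes all channels (no biases)."""
    return kernel * kernel * c_in * c_out


# Hypothetical channel sizes, for illustration only.
block = bottleneck_block_params(c_in=32, c_out=64)
hidden_channels = 32 * 6
print(block)
print("standard 3x3 conv at the same width:",
      standard_conv_params(hidden_channels, hidden_channels))
print([relu6(v) for v in (-1.0, 3.0, 10.0)])  # [0.0, 3.0, 6.0]
```

Under these assumptions the depthwise stage needs 1728 weights, whereas a standard 3x3 convolution over the same 192 channels would need 331776, which is where most of the savings come from.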


3.1.7. Inception V3 

InceptionV3 is the third version of Google's Inception neural network, which was first introduced in
the ImageNet classification competition. The model is constructed using symmetrical and
asymmetrical building blocks such as convolution layers, pooling layers, concatenations, dropout, and
fully connected layers. It also incorporates label smoothing, the RMSProp optimizer, factorized 7x7
convolutions, and batch normalization in the auxiliary classifier.


3.1.8. AlexNet 

AlexNet consists of 8 layers, each with its own set of learnable parameters. The model consists
of five convolutional layers, some of which are followed by max pooling layers, and three fully connected
layers; each of these layers, except the output layer, uses ReLU activation. Using ReLU as the activation
function resulted in a nearly six-fold increase in the speed of the training process. Additionally, the use
of dropout layers keeps the model from overfitting.


3.1.9. NFNet 

DeepMind created NFNets to eliminate the need for normalization and boost training performance.
NFNets add a method called adaptive gradient clipping (AGC), which enables fast training of normalization-free
network models such as ResNets with larger batch sizes. The primary advantage of AGC is that the clipping
threshold is set relative to the size of the weights, which makes training less sensitive to the choice of
this hyperparameter. Along with AGC, dropout is used to mimic the regularization effect that batch
normalization provided.
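
The idea behind AGC can be sketched as follows: a gradient is rescaled whenever its norm is large relative to the norm of the weights it updates. This is a simplified illustration for a single unit (one weight vector), not the NFNet implementation; the function names and the values of `clip_factor` and `eps` are assumptions made for the example.

```python
import math
from typing import List


def l2_norm(values: List[float]) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(v * v for v in values))


def adaptive_gradient_clip(weights: List[float], grads: List[float],
                           clip_factor: float = 0.01,
                           eps: float = 1e-3) -> List[float]:
    """Rescale grads so that ||grads|| <= clip_factor * max(||weights||, eps)."""
    weight_norm = max(l2_norm(weights), eps)   # eps keeps zero-initialized weights trainable
    grad_norm = l2_norm(grads)
    max_norm = clip_factor * weight_norm
    if grad_norm > max_norm:
        scale = max_norm / grad_norm
        return [g * scale for g in grads]
    return list(grads)                         # small gradients pass through unchanged


# Hypothetical weight vector (norm 5) and two hypothetical gradients.
weights = [3.0, 4.0]
large_grad = adaptive_gradient_clip(weights, [0.3, 0.4])      # norm 0.5 exceeds 0.05
small_grad = adaptive_gradient_clip(weights, [0.003, 0.004])  # norm 0.005 stays as is
print(large_grad, small_grad)
```

Because the threshold scales with the weight norm, the same `clip_factor` can be reused across layers whose weights differ greatly in magnitude.
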
3.2. Performance evaluation 

The scientific community has agreed on a number of criteria for evaluating the quality of a classification
system [22]-[24]. The confusion matrix is used to assess the models using the following key
parameters: true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN), where

- TP denotes COVID-19 images classified as COVID-19 by the models.
- TN denotes non-COVID-19 images classified as non-COVID-19 by the models.
- FP denotes non-COVID-19 images that the models have classified as COVID-19.
- FN denotes COVID-19 images classified as non-COVID-19 by the models.

Validity metrics such as accuracy, sensitivity/recall, specificity, F1-score, precision/positive
predicted value (PPV), negative predicted value (NPV), false-negative rate (FNR), false-positive rate (FPR),
false discovery rate (FDR), false omission rate (FOR), and Matthews correlation coefficient (MCC) can be
calculated using these parameters [25]-[30]. The mathematical formulas for these measurements are shown
in (1) to (11).


$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{1}$$


From (1), Accuracy is the ratio of properly predicted observations to total observations. 


$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3}$$


Sensitivity and specificity in (2) and (3) measure performance on the two classes separately. Sensitivity is
defined as the true positive rate, while specificity is defined as the true negative rate.


$$F1\text{-score} = 2 \left( \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \right) \tag{4}$$


By calculating the harmonic mean of the precision and sensitivity of a classifier, the F1-score in (4)
integrates both into a single measure.


$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$


Precision in (5) refers to a classification model's ability to identify only relevant data items. 


$$NPV = \frac{TN}{TN + FN} \tag{6}$$


In (6), NPV refers to the percentage of anticipated negatives that are truly negative. It expresses the 
likelihood that a projected negative value is a real negative value. 


$$FNR = \frac{FN}{FN + TP} \tag{7}$$


The FNR in (7) is the rate of classifying a real positive as a negative, that is, the proportion of
actual positives that the model misses.


$$FPR = \frac{FP}{FP + TN} \tag{8}$$


In (8), FPR refers to the rate of classifying a real negative as a positive. 


$$FDR = \frac{FP}{FP + TP} \tag{9}$$


The FDR in (9) is the proportion of positive predictions that are in fact negative. It is the chance that
a predicted positive value is in reality a negative value.


$$FOR = \frac{FN}{FN + TN} \tag{10}$$

The FOR in (10) is the proportion of negative predictions that are in fact positive, that is, the percentage
of people who have a negative test result but actually have the disease.


$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{11}$$

As shown in (11), MCC is a quality metric for binary classification, where 1 represents perfect agreement, 0
represents a prediction that is no better than random, and -1 represents complete disagreement between
prediction and real observation.
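
As an illustration of (1) to (11), the sketch below computes all eleven measures from the four confusion-matrix counts. The function name and the example counts are hypothetical and are not results from this study; the sketch also assumes that every denominator is non-zero.

```python
import math
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Evaluate equations (1)-(11) for one binary confusion matrix."""
    precision = tp / (tp + fp)                                      # (5)
    recall = tp / (tp + fn)                                         # (2), sensitivity
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),                # (1)
        "sensitivity": recall,                                      # (2)
        "specificity": tn / (tn + fp),                              # (3)
        "f1_score": 2 * precision * recall / (precision + recall),  # (4)
        "precision": precision,                                     # (5)
        "npv": tn / (tn + fn),                                      # (6)
        "fnr": fn / (fn + tp),                                      # (7)
        "fpr": fp / (fp + tn),                                      # (8)
        "fdr": fp / (fp + tp),                                      # (9)
        "for": fn / (fn + tn),                                      # (10)
        "mcc": (tp * tn - fp * fn) / mcc_denominator,               # (11)
    }


# Hypothetical counts for 200 test images, used only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Several of the measures are complements of one another (for example, FNR = 1 - sensitivity and FDR = 1 - precision), which provides a quick sanity check on any reported table.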


4. RESULT AND DISCUSSION 
4.1. Dataset preparation 

A database of CT scan images is used in this work, which is publicly available in [31]. The dataset
contains 746 images: 397 images of non-COVID-19 (healthy lungs) and 349 images of COVID-19. Samples are
shown in Figure 2. Since not all of the images were the same size, all images were resized to 224x224 pixels.
As deep learning architectures perform better with more data, the ImageDataGenerator function is used to
expand the size of the dataset by creating additional augmented images. The ImageDataGenerator's parameters
are shown in Table 2.

Following that, all images are converted to NumPy arrays to speed up computation. The COVID-19
and non-COVID-19 labels are encoded using the LabelBinarizer() and categorical methods. The augmented
dataset (containing 16000 images) is divided into a training set and a test set at a ratio of 80:20. The final
dataset's details are shown in Table 3.


Table 2. Augmentation parameters

Parameter            Value
Rotation Range       20
Zoom Range           0.15
Width Shift Range    0.2
Height Shift Range   0.2
Shear Range          0.15
Horizontal Flip      TRUE
Vertical Flip        TRUE
Mode                 Nearest
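
The augmentation settings in Table 2 and the 80:20 split can be written down as the following sketch. The dictionary keys follow the argument names of Keras's `ImageDataGenerator`; mapping the table's "Mode" column to `fill_mode` is an assumption, and the sketch deliberately stops short of importing TensorFlow so that it stays self-contained. The helper `split_sizes` is hypothetical.

```python
from typing import Tuple

# Table 2 expressed as keyword arguments. With TensorFlow installed, these could be
# passed as ImageDataGenerator(**AUGMENTATION_KWARGS); that call is not made here.
AUGMENTATION_KWARGS = {
    "rotation_range": 20,        # degrees
    "zoom_range": 0.15,
    "width_shift_range": 0.2,    # fraction of image width
    "height_shift_range": 0.2,   # fraction of image height
    "shear_range": 0.15,
    "horizontal_flip": True,
    "vertical_flip": True,
    "fill_mode": "nearest",      # assumed meaning of the "Mode" column
}


def split_sizes(total: int, train_percent: int = 80) -> Tuple[int, int]:
    """Return (train, test) image counts for a percentage-based split."""
    train = total * train_percent // 100
    return train, total - train


train_count, test_count = split_sizes(16000)
print(train_count, test_count)  # 12800 3200
```

The resulting counts agree with the training and testing rows of Table 3.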


Figure 2. Data set classified sample (CT image of COVID-19 and Non-COVID-19) 


Table 3. Final dataset description (After Augmentation) 


Description                          Value

Total Number of Images               16000
COVID-19 CT Image                    8000
Healthy (Non-COVID-19) CT Image      8000

Dimension (Size in Pixel)            224x224 pixels

Disease Types                        2

Training Images                      12800
Testing Images                       3200


4.2. Result analysis 

This study used CNN architectures to classify COVID-19 and non-COVID-19 chest
CT scan images. After preprocessing, 16000 images are used to obtain the best classification result. Eight
CNN architectures are used to classify the image data: CNN Max Pooling, CNN
Average Pooling, MobileNetV2, VGG16, VGG19, InceptionV3, AlexNet, and NFNet. Each architecture is trained and
tested for 100 epochs using the RMSProp optimizer with a learning rate of 0.0000001. The outcome of the
models is recorded and used to obtain the values of the confusion matrix. The confusion matrix obtained for
each architecture is illustrated in Figure 3.
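
For readers unfamiliar with RMSProp, the sketch below shows a single update step using the learning rate stated above. It is a plain-Python illustration of the update rule rather than the framework code used in the study; the decay rate `rho` and the stabilizer `eps` are assumed values, and the weight and gradient are hypothetical.

```python
import math
from typing import List, Tuple


def rmsprop_step(weights: List[float], grads: List[float], avg_sq: List[float],
                 learning_rate: float = 1e-7, rho: float = 0.9,
                 eps: float = 1e-7) -> Tuple[List[float], List[float]]:
    """One RMSProp update: scale each gradient by a running RMS of past gradients."""
    # Exponential moving average of the squared gradients.
    new_avg_sq = [rho * v + (1.0 - rho) * g * g for v, g in zip(avg_sq, grads)]
    # Divide each gradient by the root of its running average before stepping.
    new_weights = [w - learning_rate * g / (math.sqrt(v) + eps)
                   for w, g, v in zip(weights, grads, new_avg_sq)]
    return new_weights, new_avg_sq


# Hypothetical single weight and gradient, starting from an empty running average.
new_weights, new_avg_sq = rmsprop_step(weights=[0.5], grads=[2.0], avg_sq=[0.0])
print(new_weights, new_avg_sq)
```

With a learning rate of 0.0000001, each step changes a weight by an amount on the order of 1e-7, so under these assumptions the pre-trained weights move only slightly per update.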

In the confusion matrix, the cell (1,1) represents TP, (0,1) represents FP, (0,0) represents TN, and
(1,0) represents FN. As stated earlier, half of the images in the dataset are of COVID-19 and the other
half are of non-COVID-19. The confusion matrices show that all the models could
identify the classes with high accuracy. However, among the models, MobileNetV2
correctly identifies the largest percentage of COVID-19 and non-COVID-19 data, with the smallest
percentage of incorrect predictions. Of the 50% of the data that are COVID-19 images, MobileNetV2 correctly
identified 49.9%, and of the 50% that are non-COVID-19 images, it correctly identified 49.22%.
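
The cell convention above can be made explicit with a short sketch that tallies a 2x2 confusion matrix from label lists. It assumes that label 1 denotes COVID-19, label 0 denotes non-COVID-19, and that cells are indexed as (actual, predicted), which is one reading consistent with the (0,1) = FP and (1,0) = FN convention stated in the text. The labels are hypothetical.

```python
from typing import List


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> List[List[int]]:
    """Return a 2x2 matrix where matrix[actual][predicted] counts the samples."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Hypothetical labels: 1 = COVID-19, 0 = non-COVID-19.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
tn, fp = matrix[0][0], matrix[0][1]
fn, tp = matrix[1][0], matrix[1][1]
print(matrix)  # [[3, 1], [1, 3]]
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```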

Using the elements obtained from the confusion matrix and (1) to (11), the performance of the
models is measured and recorded in Table 4. The table shows that MobileNetV2 achieves the
overall highest accuracy of 99.12%. The second highest accuracy is obtained by VGG19 with 98.25%. It is
followed by AlexNet and Max Pooling with an accuracy of 97.81% and 97.52%, respectively. Furthermore,
MobileNetV2 achieved not only the highest accuracy but also the highest sensitivity, specificity, F1-score,
precision, NPV, and MCC, with 98.65%, 99.59%, 99.12%, 99.6%, 98.63%, and 0.9824, respectively.


Figure 3. Confusion matrix of the models

Table 4. Performance evaluation of CNN models (all criteria in % except MCC)

Criteria      MobileNetV2  VGG19   Max Pooling  Average Pooling  VGG16   NFNet   AlexNet  Inception V3
Accuracy      99.12        98.25   97.52        95.48            93.46   96.43   97.81    94.75
Sensitivity   98.65        98.4    97.75        95.69            92.97   95.96   97.41    95.56
Specificity   99.59        98.08   97.3         95.26            93.94   96.89   98.21    93.94
F1-Score      99.12        98.26   97.52        95.48            93.46   96.43   97.81    94.77
Precision     99.6         98.11   97.3         95.27            93.95   96.91   98.21    93.99
NPV           98.63        98.38   97.75        95.68            92.97   95.95   97.41    95.52
FNR           1.34         1.59    2.24         4.3              7.02    4.03    2.58     4.43
FPR           0.4          1.91    2.69         4.73             6.05    3.1     1.78     6
FDR           0.39         1.88    2.69         4.72             6.04    3.08    1.78     6
FOR           1.36         1.61    2.24         4.31             7.02    4.04    2.58     4.47
MCC           0.9824       0.965   0.9505       0.9096           0.8692  0.9286  0.9562   0.8951
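
The entries of Table 4 can be cross-checked against (1) to (5). If the two classes are equally represented in the test set, which is an assumption based on the balanced dataset described earlier, accuracy equals the mean of sensitivity and specificity, and the F1-score follows from precision and sensitivity via (4). The sketch below recomputes both from the table's own values; the names `TABLE_4`, `f1_from`, and `consistency_gaps` are illustrative.

```python
from typing import Dict, Tuple

# Rows copied from Table 4, all in percent:
# model: (accuracy, sensitivity, specificity, f1_score, precision)
TABLE_4: Dict[str, Tuple[float, float, float, float, float]] = {
    "MobileNetV2": (99.12, 98.65, 99.59, 99.12, 99.60),
    "VGG19": (98.25, 98.40, 98.08, 98.26, 98.11),
    "Max Pooling": (97.52, 97.75, 97.30, 97.52, 97.30),
    "Average Pooling": (95.48, 95.69, 95.26, 95.48, 95.27),
    "VGG16": (93.46, 92.97, 93.94, 93.46, 93.95),
    "NFNet": (96.43, 95.96, 96.89, 96.43, 96.91),
    "AlexNet": (97.81, 97.41, 98.21, 97.81, 98.21),
    "Inception V3": (94.75, 95.56, 93.94, 94.77, 93.99),
}


def f1_from(precision: float, recall: float) -> float:
    """Equation (4): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def consistency_gaps(rows: Dict[str, Tuple[float, float, float, float, float]]
                     ) -> Dict[str, Tuple[float, float]]:
    """Absolute gap between reported and recomputed accuracy and F1-score."""
    gaps = {}
    for model, (acc, sens, spec, f1, prec) in rows.items():
        balanced_accuracy = (sens + spec) / 2   # equals accuracy for balanced classes
        gaps[model] = (abs(acc - balanced_accuracy), abs(f1 - f1_from(prec, sens)))
    return gaps


gaps = consistency_gaps(TABLE_4)
for model, (accuracy_gap, f1_gap) in gaps.items():
    print(f"{model:16s} accuracy gap {accuracy_gap:.3f}  F1 gap {f1_gap:.3f}")
```

For MobileNetV2, for instance, (98.65 + 99.59) / 2 = 99.12, which matches the reported accuracy.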


The efficiency of an architecture depends on its classification accuracy,
classification completion time, and data loss rate. An efficient classification algorithm is characterized by a
high accuracy rate achieved in a low completion time with a low rate of data loss. These three quantities are
shown in Figure 4. Figure 4(a) shows that the accuracy of MobileNetV2 is the
highest. Figure 4(b) shows that VGG19 has the lowest rate of data loss. Figure 4(c) shows that the lowest
classification completion time belongs to Max Pooling. However, putting the three factors together,
MobileNetV2 offers the best overall combination of high accuracy, low data loss, and low completion time.
Though VGG19 has the lowest rate of data loss, it has a higher completion time and lower accuracy
than MobileNetV2. Likewise, Max Pooling has the lowest completion time but a much poorer
accuracy than MobileNetV2. The accuracy, rate of data loss, and classification completion time per epoch for
MobileNetV2 are 99.12%, 0.0504%, and 16.5 s/epoch, respectively.


Indonesian J Elec Eng & Comp Sci, Vol. 26, No. 1, April 2022: 462-471 


Indonesian J Elec Eng & Comp Sci, ISSN: 2502-4752


Figure 4. Performance of the models: (a) classification accuracy, which is highest for MobileNetV2,
(b) data loss, which is lowest for VGG19, and (c) classification completion time, which is lowest for
max pooling


5. CONCLUSION 

Early diagnosis of COVID-19 has been considered difficult because of the disease's potential to
spread across society. The diagnostic procedure may be made more precise and faster using deep learning methods
and soft computing. This study evaluated eight deep learning models that could help diagnose
COVID-19 automatically. Among them, the MobileNetV2 model produced better accuracy than the other models,
with average data loss and completion time. Though VGG19 has the lowest data loss rate, it has a higher
completion time and lower accuracy than MobileNetV2. Future studies will need to develop a
hybrid deep learning method that can handle a large number of images and determine how
much of the lung's volume is infected.